{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "709dc2da-ce72-4039-8baf-e816225219f3",
   "metadata": {},
   "source": [
    "## 什么是DataFrame\n",
    "其实前面已经在用了，pd.read_csv() pd.read_table() 等，它们最终返回的都是 DataFrame。\n",
    "### 定义\n",
    "DataFrame = 带行索引和列标签的二维表格。\n",
    "### 一个 DataFrame 有三个核心部分：\n",
    "\n",
    "1. 数据（values）：存放在表格里的内容（可以是数值、字符串、布尔值等）。\n",
    "2. 行索引（index）：标识每一行。\n",
    "3. 列索引（columns）：标识每一列。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "98dd91dc-cbb2-4e3b-8aab-26c4f5f9cffb",
   "metadata": {},
   "source": [
    "## 创建DataForm\n",
    "### 当你用一个 字典（dict） 来创建 DataFrame 时，字典的 key 会自动变成列名，而 字典的 value（必须是序列型，如 list、Series、ndarray）会变成这一列的数据。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "a966e26c-535c-4729-89b3-714d82006310",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Names</th>\n",
       "      <th>Age</th>\n",
       "      <th>Score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>id1</th>\n",
       "      <td>Alice</td>\n",
       "      <td>25</td>\n",
       "      <td>88.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>id2</th>\n",
       "      <td>Bob</td>\n",
       "      <td>30</td>\n",
       "      <td>92.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>id3</th>\n",
       "      <td>Charlie</td>\n",
       "      <td>35</td>\n",
       "      <td>95.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       Names  Age  Score\n",
       "id1    Alice   25   88.5\n",
       "id2      Bob   30   92.3\n",
       "id3  Charlie   35   95.0"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "\n",
    "data={\n",
    "    \"Names\":[\"Alice\", \"Bob\", \"Charlie\"],\n",
    "    \"Age\":[25,30,35],\n",
    "     \"Score\": [88.5, 92.3, 95.0]\n",
    "}\n",
    "df=pd.DataFrame(data,index=[\"id1\",\"id2\",\"id3\"])\n",
    "df"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "6b36d0ee-0b1e-4488-acdb-aa7c5e37a34d",
   "metadata": {},
   "source": [
    "## 另一个例子\n",
    "### data=[[1, 'a', 1.2], [2, 'b', 2.2], [3, 'c', 3.2]]\n",
    "### 这是一个二维列表，相当于表格的行数据：\n",
    "- 第 1 行：[1, 'a', 1.2]\n",
    "- 第 2 行：[2, 'b', 2.2]\n",
    "- 第 3 行：[3, 'c', 3.2]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "31bfeaac-d455-407f-8be9-9f57be57376e",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>col_0</th>\n",
       "      <th>col_1</th>\n",
       "      <th>col_2</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>row_0</th>\n",
       "      <td>1</td>\n",
       "      <td>a</td>\n",
       "      <td>1.2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>row_1</th>\n",
       "      <td>2</td>\n",
       "      <td>b</td>\n",
       "      <td>2.2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>row_2</th>\n",
       "      <td>3</td>\n",
       "      <td>c</td>\n",
       "      <td>3.2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       col_0 col_1  col_2\n",
       "row_0      1     a    1.2\n",
       "row_1      2     b    2.2\n",
       "row_2      3     c    3.2"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data=[[1, 'a', 1.2], [2, 'b', 2.2], [3, 'c', 3.2]]\n",
    "df2=pd.DataFrame(\n",
    "    data=data,\n",
    "    index=[f'row_{i}' for i in range(3)], #用列表推导式生成行索引\n",
    "    columns=['col_0', 'col_1', 'col_2'] #指定列名\n",
    ")\n",
    "df2"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "394c2030",
   "metadata": {},
   "source": [
    "## 取一列，返回的就是series"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "710459cb-4cd5-48d3-b0ea-9b1f85c4ac27",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "row_0    1\n",
       "row_1    2\n",
       "row_2    3\n",
       "Name: col_0, dtype: int64"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2['col_0']"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "940669bd",
   "metadata": {},
   "source": [
    "## 取多列，返回的就是DataFrame(每一列其实就是一个series)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "aa63112a",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>col_0</th>\n",
       "      <th>col_1</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>row_0</th>\n",
       "      <td>1</td>\n",
       "      <td>a</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>row_1</th>\n",
       "      <td>2</td>\n",
       "      <td>b</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>row_2</th>\n",
       "      <td>3</td>\n",
       "      <td>c</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       col_0 col_1\n",
       "row_0      1     a\n",
       "row_1      2     b\n",
       "row_2      3     c"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2[['col_0','col_1']]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0222630e",
   "metadata": {},
   "source": [
    "## 和Series类似， DataFrame中也可以通过 . 操作取属性"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "24e35437",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[1, 'a', 1.2],\n",
       "       [2, 'b', 2.2],\n",
       "       [3, 'c', 3.2]], dtype=object)"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2.values"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "1ddb7df1",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index(['row_0', 'row_1', 'row_2'], dtype='object')"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2.index"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "4932bb7e",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index(['col_0', 'col_1', 'col_2'], dtype='object')"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2.columns"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "afd116da",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "col_0      int64\n",
       "col_1     object\n",
       "col_2    float64\n",
       "dtype: object"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2.dtypes #返回的值是相应的列的数据类型"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "9d7fe4e0",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(3, 3)"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df2.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d5f4e8e3",
   "metadata": {},
   "source": [
    "## 使用 .T 可以转置，即让列变成行，请看对比"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "068b48ac",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Names</th>\n",
       "      <th>Age</th>\n",
       "      <th>Score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>id1</th>\n",
       "      <td>Alice</td>\n",
       "      <td>25</td>\n",
       "      <td>88.5</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>id2</th>\n",
       "      <td>Bob</td>\n",
       "      <td>30</td>\n",
       "      <td>92.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>id3</th>\n",
       "      <td>Charlie</td>\n",
       "      <td>35</td>\n",
       "      <td>95.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       Names  Age  Score\n",
       "id1    Alice   25   88.5\n",
       "id2      Bob   30   92.3\n",
       "id3  Charlie   35   95.0"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "614a90dc",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>id1</th>\n",
       "      <th>id2</th>\n",
       "      <th>id3</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>Names</th>\n",
       "      <td>Alice</td>\n",
       "      <td>Bob</td>\n",
       "      <td>Charlie</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Age</th>\n",
       "      <td>25</td>\n",
       "      <td>30</td>\n",
       "      <td>35</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Score</th>\n",
       "      <td>88.5</td>\n",
       "      <td>92.3</td>\n",
       "      <td>95.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         id1   id2      id3\n",
       "Names  Alice   Bob  Charlie\n",
       "Age       25    30       35\n",
       "Score   88.5  92.3     95.0"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.T"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "01c2cd78",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "base",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
